qini curve
Qini curve estimation under clustered network interference
Karlsson, Rickard K. A., Akker, Bram van den, Moraes, Felipe, Proença, Hugo M., Krijthe, Jesse H.
Qini curves are a widely used tool for assessing treatment policies under allocation constraints, as they visualize the incremental gain of a new treatment policy versus the cost of its implementation. Standard Qini curve estimation assumes no interference between units: that is, treating one unit does not influence the outcome of any other unit. In many real-life applications such as public policy or marketing, however, interference is common. Ignoring interference in these scenarios can lead to systematically biased Qini curves that over- or under-estimate a treatment policy's cost-effectiveness. In this paper, we address the problem of Qini curve estimation under clustered network interference, where interfering units form independent clusters. We propose a formal description of the problem setting with an experimental study design under which we can account for clustered network interference. Within this framework, we introduce three different estimation strategies suited for different conditions. Moreover, we introduce a marketplace simulator that emulates clustered network interference in a typical e-commerce setting. From both theoretical and empirical insights, we provide recommendations for choosing the best estimation strategy by identifying an inherent bias-variance trade-off among the estimation strategies.
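For orientation, the standard Qini curve that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the common no-interference estimator on randomized data (cumulative treated outcomes minus control outcomes rescaled to the treated group size), not one of the three clustered estimators proposed in the paper. The function name `qini_curve` and its arguments are hypothetical.

```python
import numpy as np

def qini_curve(scores, treated, outcome):
    """Standard (no-interference) Qini curve from randomized data.

    scores  : predicted uplift per unit, higher means treat first
    treated : 1 if the unit was treated, 0 if it was a control
    outcome : observed outcome per unit
    Returns an array of length n + 1; entry k is the estimated
    incremental gain from targeting the k highest-scored units.
    """
    scores = np.asarray(scores, dtype=float)
    treated = np.asarray(treated, dtype=int)
    outcome = np.asarray(outcome, dtype=float)

    # Rank units from highest to lowest predicted uplift.
    order = np.argsort(-scores, kind="stable")
    t = treated[order]
    y = outcome[order]

    # Cumulative group sizes and outcome sums among the top-k units.
    cum_n_t = np.cumsum(t)
    cum_n_c = np.cumsum(1 - t)
    cum_y_t = np.cumsum(y * t)
    cum_y_c = np.cumsum(y * (1 - t))

    # Rescale control outcomes to the treated group size.
    # Where no control unit has been seen yet, the ratio is set to 0.
    ratio = np.divide(cum_n_t, cum_n_c,
                      out=np.zeros(len(t)), where=cum_n_c > 0)
    gain = cum_y_t - cum_y_c * ratio

    # Prepend 0 for the "treat nobody" policy.
    return np.concatenate(([0.0], gain))
```

Because each unit's contribution depends only on its own treatment indicator, this estimator is the one that becomes biased when treating one unit changes another unit's outcome.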
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.93)
- Information Technology > Services (0.48)
- Health & Medicine (0.46)
The impact of heteroskedasticity on uplift modeling
Bokelmann, Björn, Lessmann, Stefan
In various applications, companies need to decide which individuals should receive a treatment. To support such decisions, uplift models are applied to predict treatment effects on an individual level. Based on the predicted treatment effects, individuals can be ranked and treatment allocation can be prioritized according to this ranking. An implicit assumption, which has not been questioned in the previous uplift modeling literature, is that this treatment prioritization approach tends to bring individuals with high treatment effects to the top and individuals with low treatment effects to the bottom of the ranking. In our research, we show that heteroskedasticity in the training data can bias the uplift model ranking: individuals with the highest treatment effects can accumulate in large numbers at the bottom of the ranking. We explain theoretically how heteroskedasticity can bias the ranking of uplift models and show this process in a simulation and on real-world data. We argue that this problem of ranking bias due to heteroskedasticity might occur in many real-world applications and requires modification of the treatment prioritization to achieve an efficient treatment allocation.
- North America > United States > New Jersey (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.48)
KDSM: An uplift modeling framework based on knowledge distillation and sample matching
Sun, Chang, Li, Qianying, Wang, Guanxiang, Xu, Sihao, Liu, Yitong
Uplift modeling aims to estimate the treatment effect on individuals and is widely applied on e-commerce platforms to target persuadable customers and maximize the return of marketing activities. Among the existing uplift modeling methods, tree-based methods are adept at fitting the increment and at generalization, while neural-network-based models excel at predicting absolute values with precision; these advantages have not been fully explored and combined. In addition, the lack of counterfactual sample pairs is the root challenge in uplift modeling. In this paper, we propose an uplift modeling framework based on Knowledge Distillation and Sample Matching (KDSM). The teacher model is the uplift decision tree (UpliftDT), whose structure is exploited to construct counterfactual sample pairs, and the pairwise incremental prediction is treated as another objective for the student model. Under the idea of multitask learning, the student model can achieve better generalization performance and even surpass the teacher. Extensive offline experiments validate the universality of different combinations of teacher and student models and the superiority of KDSM measured against the baselines. In online A/B testing, the cost of each incremental room night is reduced by 6.5%.
- North America > Montserrat (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education (1.00)
- Information Technology > Services (0.34)
Improving uplift model evaluation on RCT data
Bokelmann, Björn, Lessmann, Stefan
Estimating treatment effects is one of the most challenging and important tasks of data analysts. In many applications, like online marketing and personalized medicine, treatment needs to be allocated to the individuals where it yields a high positive treatment effect. Uplift models help select the right individuals for treatment and maximize the overall treatment effect (uplift). A major challenge in uplift modeling concerns model evaluation. Previous literature suggests methods like the Qini curve and the transformed outcome mean squared error. However, these metrics suffer from variance: their evaluations are strongly affected by random noise in the data, which renders their signals, to a certain degree, arbitrary. We theoretically analyze the variance of uplift evaluation metrics and derive possible methods of variance reduction, which are based on statistical adjustment of the outcome. We derive simple conditions under which the variance reduction methods improve the uplift evaluation metrics and empirically demonstrate their benefits on simulated and real-world data. Our paper provides strong evidence in favor of applying the suggested variance reduction procedures by default when evaluating uplift models on RCT data.
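As background for the metrics named in this abstract, the transformed outcome and its mean squared error can be sketched as below. This assumes a randomized trial with a known, constant treatment probability `p_treat`; the function names are hypothetical, and the sketch shows only the unadjusted metric, not the variance reduction procedures the paper derives.

```python
import numpy as np

def transformed_outcome(outcome, treated, p_treat):
    """Transformed outcome Y* = Y * (W - p) / (p * (1 - p)).

    Under randomization with treatment probability p, the conditional
    expectation of Y* equals the conditional average treatment effect.
    """
    y = np.asarray(outcome, dtype=float)
    w = np.asarray(treated, dtype=float)
    return y * (w - p_treat) / (p_treat * (1.0 - p_treat))

def transformed_outcome_mse(uplift_pred, outcome, treated, p_treat):
    """Mean squared error between predicted uplift and Y*."""
    y_star = transformed_outcome(outcome, treated, p_treat)
    pred = np.asarray(uplift_pred, dtype=float)
    return float(np.mean((y_star - pred) ** 2))
```

Each value of Y* is a single outcome scaled by 1/p or 1/(1 - p), so it is far noisier than the treatment effect it targets, which is consistent with the variance problem the abstract describes.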
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.67)
Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Bozorgi, Zahra Dasht, Teinemaa, Irene, Dumas, Marlon, La Rosa, Marcello
Reducing cycle time is a recurrent concern in the field of business process management. Depending on the process, various interventions may be triggered to reduce the cycle time of a case, for example, using a faster shipping service in an order-to-delivery process or giving a phone call to a customer to obtain missing information rather than waiting passively. Each of these interventions comes with a cost. This paper tackles the problem of determining if and when to trigger a time-reducing intervention in a way that maximizes the total net gain. The paper proposes a prescriptive process monitoring method that uses orthogonal random forest models to estimate the causal effect of triggering a time-reducing intervention for each ongoing case of a process. Based on this causal effect estimate, the method triggers interventions according to a user-defined policy. The method is evaluated on two real-life logs.
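The decision step described here, triggering an intervention based on a causal effect estimate and a user-defined policy, might look like the following sketch. The threshold rule, the function names, and the parameter `value_per_time_unit` are assumptions made for illustration; the abstract does not specify the policy, and the effect estimates would come from a separately fitted causal model.

```python
def net_gain(est_time_reduction, value_per_time_unit, intervention_cost):
    """Expected net gain of intervening on one case (hypothetical rule).

    est_time_reduction : estimated cycle time saved by intervening
    value_per_time_unit: monetary value assigned to one saved time unit
    intervention_cost  : cost of triggering the intervention
    """
    return est_time_reduction * value_per_time_unit - intervention_cost

def select_cases(est_reductions, value_per_time_unit, intervention_cost):
    """Return indices of cases whose expected net gain is positive."""
    return [
        i
        for i, reduction in enumerate(est_reductions)
        if net_gain(reduction, value_per_time_unit, intervention_cost) > 0
    ]
```

For example, with a value of 10 per saved time unit and an intervention cost of 15, only cases with an estimated reduction above 1.5 time units would be selected.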
- Europe > Estonia > Tartu County > Tartu (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Response Transformation and Profit Decomposition for Revenue Uplift Modeling
Gubela, Robin M., Lessmann, Stefan, Jaroszewicz, Szymon
Uplift models support decision-making in marketing campaign planning. Estimating the causal effect of a marketing treatment, an uplift model facilitates targeting communication to responsive customers and efficient allocation of marketing budgets. Research into uplift models focuses on conversion models to maximize incremental sales. The paper introduces uplift modeling strategies for maximizing incremental revenues. If customers differ in their spending behavior, revenue maximization is a more plausible business objective compared to maximizing conversions. The proposed methodology entails a transformation of the prediction target, customer-level revenues, that facilitates implementing a causal uplift model using standard machine learning algorithms. The distribution of campaign revenues is typically zero-inflated because of many non-buyers. Remedies to this modeling challenge are incorporated in the proposed revenue uplift strategies in the form of two-stage models. Empirical experiments using real-world e-commerce data confirm the merits of the proposed revenue uplift strategy over relevant alternatives including uplift models for conversion and recently developed causal machine learning algorithms. To quantify the degree to which improved targeting decisions raise return on marketing, the paper develops a decomposition of campaign profit. Applying the decomposition to a digital coupon targeting campaign, the paper provides evidence that revenue uplift modeling, as well as causal machine learning, can improve campaign profit substantially.
- North America > United States > New York (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Marketing (1.00)
- Health & Medicine (1.00)
- Banking & Finance (1.00)
- Information Technology > Services (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)